(54) VOICE DRIVEN MOUTH ANIMATION SYSTEM



(57) In a synchronization control apparatus, a voice- 
language-information generating section generates the 
voice language information of a word which a robot ut- 
ters. A voice synthesizing section calculates phoneme 
information and a phoneme continuation period accord- 
ing to the voice language information, and also gener- 
ates synthesized-voice data according to an adjusted 



phoneme continuation period. An articulation-operation 
generating section calculates an articulation-operation 
period according to the phoneme information. A voice- 
operation adjusting section adjusts the phoneme con- 
tinuation period and the articulation-operation period. 
An articulation-operation executing section operates an 
organ of articulation according to the adjusted articula- 
tion-operation period. 
Description

BACKGROUND OF THE INVENTION 

1. Field of the Invention

[0001] The present invention relates to synchroniza- 
tion control apparatuses, synchronization control meth- 
ods, and recording media. For example, the present in- 
vention relates to a synchronization control apparatus, 
a synchronization control method, and a recording me- 
dium suited to a case in which synthesized-voice out- 
puts are synchronized with the operations of a portion 
which imitates the motions of an organ of articulation 
and which is provided for the head of a robot. 

2. Description of the Related Art 

[0002] Some robots which imitate human beings or 
animals have movable portions (such as a portion sim- 
ilar to a mouth which opens or closes when the jaws 
open and close) which imitate mouths, jaws, and the 
like. Others output voices while operating mouths, jaws, 
and the like. 

[0003] When such robots operate the mouths and the 
like correspondingly to uttered words such that, for ex- 
ample, the mouths and the like have a shape in which 
human beings utter a sound of "a," at the output timing 
of a sound of "a," and have a shape in which human 
beings utter a sound of "i," at the output timing of a sound 
of "i," the robots imitate human beings more realistically. However,
such robots have not yet been created.

SUMMARY OF THE INVENTION 

[0004] The present invention has been made in con- 
sideration of the foregoing condition. Accordingly, an ob- 
ject of the present invention is to implement a robot 
which imitates a human being more realistically, in that
the operation of a portion which imitates an organ
of articulation corresponds to uttered words generated 
by voice synthesis at utterance timing. 
[0005] The foregoing object is achieved in one aspect 
of the present invention through the provision of a syn- 
chronization control apparatus for synchronizing the 
output of a voice signal and the operation of a movable 
portion, including phoneme-information generating 
means for generating phoneme information formed of a 
plurality of phonemes by using language information; 
calculation means for calculating a phoneme continuation
period (i.e. duration) according to the phoneme information
generated by the phoneme-information generating means;
computing means for computing the operation period of the
movable portion according to the phoneme information
generated by the phoneme-information generating means;
adjusting means for adjusting
the phoneme continuation period calculated by the cal- 
culation means and the operation period computed by 



the computing means; synthesized-voice-information
generating means for generating synthesized-voice in- 
formation according to the phoneme continuation period 
adjusted by the adjusting means; synthesizing means 
for synthesizing the voice signal according to the syn- 
thesized-voice information generated by the synthe- 
sized-voice-information generating means; and opera- 
tion control means for controlling the operation of the 
movable portion according to the operation period ad- 
justed by the adjusting means. 

[0006] The synchronization control apparatus may be 
configured such that the adjusting means compares the 
phoneme continuation period and the operation period 
corresponding to each of the phonemes and performs 
adjustment by substituting whichever is the longer for 
the shorter. 

[0007] The synchronization control apparatus may be 
configured such that the adjusting means performs ad- 
justment by synchronizing at least one of the start timing 
and the end timing, of the phoneme continuation period 
and the operation period corresponding to any of the 
phonemes. 

[0008] The synchronization control apparatus may be 
configured such that the adjusting means performs ad- 
justment by substituting one of the phoneme continua- 
tion period and the operation period corresponding to 
all of the phonemes, for the other. 
[0009] The synchronization control apparatus may be
configured such that the adjusting means performs ad- 
justment by synchronizing at least one of the start timing 
and the end timing, of the phoneme continuation period 
and the operation period corresponding to each of the 
phonemes, and by placing no-process periods at lack- 
ing intervals. 

[0010] The synchronization control apparatus may be
configured such that the adjusting means compares the 
phoneme continuation period and the operation period 
corresponding to all of the phonemes and performs ad- 
justment by extending whichever is the shorter in pro- 
portion. 

[0011] The synchronization control apparatus may be
configured such that the operation control means con- 
trols the operation of the movable portion which imitates 
the operation of an organ of articulation of an animal. 
[0012] The synchronization control apparatus may 
further comprise detection means for detecting an ex- 
ternal force operation applied to the movable portion. 
[0013] The synchronization control apparatus may be 
configured such that at least one of the synthesizing 
means and the operation control means changes a 
process currently being executed, in response to a de- 
tection result obtained by the detection means. 
[0014] The synchronization control apparatus may be 
a robot. 

[0015] The foregoing object is achieved in another aspect
of the present invention through the provision of
a synchronization control method of synchronizing the 
output of a voice signal and the operation of a movable 




EP1 113 422 A2 



portion, including a phoneme-information generating 
step of generating phoneme information formed of a plurality
of phonemes by using language information; a calculation
step of calculating a phoneme continuation period according
to the phoneme information generated in
the phoneme-information generating step; a computing 
step of computing the operation period of the movable 
portion according to the phoneme information generat- 
ed in the phoneme-information generating step; an ad- 
justing step for adjusting the phoneme continuation pe- 
riod calculated in the calculation step and the operation 
period computed in the computing step; a synthesized- 
voice-information generating step of generating synthe- 
sized-voice information according to the phoneme con- 
tinuation period adjusted in the adjusting step; a synthe- 
sizing step of synthesizing the voice signal according to 
the synthesized-voice information generated in the syn- 
thesized-voice-information generating step; and an op- 
eration control step of controlling the operation of the 
movable portion according to the operation period ad- 
justed in the adjusting step. 

[0016] The foregoing object is achieved in still another 
aspect of the present invention through the provision of 
a recording medium storing a computer-readable pro- 
gram for synchronizing the output of a voice signal and 
the operation of a movable portion, the program including
a phoneme-information generating step of generating
phoneme information formed of a plurality of phonemes
by using language information; a calculation step
of calculating a phoneme continuation period according 
to the phoneme information generated in the phoneme- 
information generating step; a computing step of com- 
puting the operation period of the movable portion ac- 
cording to the phoneme information generated in the 
phoneme-information generating step; an adjusting 
step for adjusting the phoneme continuation period cal- 
culated in the calculation step and the operation period 
computed in the computing step; a synthesized-voice- 
information generating step of generating synthesized- 
voice information according to the phoneme continua- 
tion period adjusted in the adjusting step; a synthesizing 
step of synthesizing the voice signal according to the 
synthesized-voice information generated in the synthe- 
sized-voice-information generating step; and an opera- 
tion control step of controlling the operation of the mov- 
able portion according to the operation period adjusted 
in the adjusting step. 

[0017] In a synchronization control apparatus, a syn- 
chronization control method, and a program stored in a 
recording medium according to the present invention, 
phoneme information formed of a plurality of phonemes 
is generated by using language information, and a pho- 
neme continuation period is calculated according to the 
generated phoneme information. The operation period 
of a movable portion is also computed according to the 
generated phoneme information. The calculated pho- 
neme continuation period and the computed operation 
period are adjusted, synthesized-voice information is 



generated according to the adjusted phoneme continu- 
ation period, and a voice signal is synthesized according 
to the generated synthesized-voice information. In ad- 
dition, the operation of the movable portion is controlled 
according to the adjusted operation period. 
[0018] As described above, according to a synchroni- 
zation control apparatus, a synchronization control 
method, and a program stored in a recording medium 
of the present invention, phoneme information formed 
of a plurality of phonemes is generated by using lan- 
guage information, a phoneme continuation period and 
the operation period of a movable portion are calculated 
according to the generated phoneme information, the 
phoneme continuation period and the operation period 
are adjusted, and the operation of the movable portion 
is controlled according to the adjusted operation period. 
Therefore, a word to be uttered by voice synthesis at 
utterance timing can be synchronized with the operation 
of a portion which imitates an organ of articulation, and 
a more realistic robot is implemented.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0019] Fig. 1 is a block diagram showing an example 
structure of a section controlling the operation of a por- 
tion which imitates an organ of articulation and control- 
ling the voice outputs of a robot to which the present 
invention is applied. 

[0020] Fig. 2 is a view showing example phoneme in- 
formation and an example phoneme continuation peri- 
od. 

[0021] Fig. 3 is a view showing example articulation- 
operation instructions and example articulation-opera- 
tion periods. 

[0022] Fig. 4 is a view showing an example of adjust- 
ed phoneme continuation periods. 
[0023] Fig. 5 is a flowchart showing the operation of 
the robot to which the present invention is applied. 
[0024] Figs. 6A and 6B show an example of a phoneme
continuation period and an example of an articulation-operation
period, respectively, corresponding to each other.

[0025] Fig. 7 is a view showing the phoneme contin- 
uation period and the articulation-operation period ad- 
justed by a first method. 

[0026] Fig. 8 is a view showing the phoneme continuation
period and the articulation-operation period adjusted
by a second method.

[0027] Figs. 9A and 9B show the phoneme continua- 
tion period and the articulation-operation period adjust- 
ed by a third method, respectively. 
[0028] Fig. 10 is a view showing the phoneme continuation
period and the articulation-operation period adjusted
by a fourth method.

[0029] Fig. 11 is a view showing the phoneme continuation
period and the articulation-operation period adjusted
by a fifth method.

[0030] Figs. 12A and 12B show examples in which 
phoneme information is synchronized with the operations
of portions other than the organs of articulation.

DESCRIPTION OF THE PREFERRED EMBODIMENT 

[0031] Fig. 1 shows an example structure of a section 
controlling the operation of a portion which imitates an 
organ of articulation, such as jaws, lips, a throat, a 
tongue, or nostrils, and controlling the voice outputs of 
a robot to which the present invention is applied. This 
example structure is, for example, provided for the head
of the robot. 

[0032] An input section 1 includes a microphone and 
a voice recognition function (neither part shown), and 
converts a voice signal (words which the robot is made 
to repeat, such as "konnichiwa" (meaning hello in Jap- 
anese), or words spoken to the robot) input to the mi- 
crophone to text data by the voice recognition function 
and sends it to a voice-language-information generating
section 2. Text data may be externally input to the voice- 
language-information generating section 2. 
[0033] When the robot has a dialogue, the voice-lan- 
guage-information generating section 2 generates the 
voice language information (indicating a word to be ut- 
tered) of a word to be uttered as a response to the text 
data input from the input section 1, and outputs it to a
control section 3. The voice-language-information gen- 
erating section 2 outputs the text data input from the in- 
put section 1 as is to the control section 3 when the robot 
is made to perform repetition. Voice language informa- 
tion is expressed by text data, such as Japanese Kana 
letters, alphabetical letters, and phonetic symbols. 
[0034] The control section 3 controls a drive 11 so as 
to read a control program stored in a magnetic disk 12, 
an optical disk 13, a magneto-optical disk 14, or a sem- 
iconductor memory 15, and controls each section ac- 
cording to the read control program. 
[0035] More specifically, the control section 3 sends 
the text data input as the voice language information 
from the voice-language-information generating section 
2, to a voice synthesizing section 4; sends phoneme in- 
formation output from the voice synthesizing section 4, 
to an articulation-operation generating section 5; and 
sends an articulation-operation period output from the 
articulation-operation generating section 5 and the pho- 
neme information and a phoneme continuation period 
output from the voice synthesizing section 4, to a voice- 
operation adjusting section 6. The control section 3 also 
sends an adjusted phoneme continuation period output 
from the voice-operation adjusting section 6, to the voice 
synthesizing section 4, and an adjusted articulation-op- 
eration period output from the voice-operation adjusting 
section 6 to an articulation-operation executing section 
7. The control section 3 further sends synthesized-voice 
data output from the voice synthesizing section 4, to a 
voice output section 9. The control section 3 furthermore 
halts, resumes, or stops the processing of the articula- 
tion-operation executing section 7 and the voice output 



section 9 according to detection information output from 
an external sensor 8. 

[0036] The voice synthesizing section 4 generates
phoneme information ("KOXNICHIWA" in this case)
from the text data (such as "konnichiwa") output from
the voice-language-information generating section 2 as
voice language information, which is input from the
control section 3, as shown in Fig. 2; calculates the
phoneme continuation period of each phoneme; and outputs
them to the control section 3. The voice synthesizing section
4 also generates synthesized voice data according to
the adjusted phoneme continuation period output from
the voice-operation adjusting section 6, which is input
from the control section 3. The generated synthesized
voice data includes synthesized-voice data generated
according to a rule, which is generally known, and data
reproduced from recorded voices.
[0037] The articulation-operation generating section
5 calculates the articulation-operation instruction
(instruction for instructing the operation of a portion which
imitates each organ of articulation) corresponding to
each phoneme and an articulation-operation period
indicating the period of the operation, as shown in Fig. 3,
according to the phoneme information output from the
voice synthesizing section 4, which is input from the
control section 3, and outputs them to the control section 3.
In an example shown in Fig. 3, jaws, lips, a throat, a
tongue, and nostrils serve as organs 16 of articulation.
Articulation-operation instructions include those for the
up or down movement of the jaws, the shape change
and the open or close operation of the lips, the front or
back, up or down, and left or right movements of the
tongue, the amplitude and the up or down movement of
the throat, and a change in shape of the nose. An
articulation-operation instruction may be independently sent
to one of the organs 16 of articulation. Alternatively,
articulation-operation instructions may be sent to a
combination of a plurality of organs 16 of articulation.
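
As a rough illustration of the kind of output the articulation-operation generating section 5 produces, the following Python sketch maps each phoneme to instructions for one or more organs, each with a period. The class name `ArticulationInstruction`, the table contents, the action labels, and the millisecond values are all hypothetical and are not taken from Fig. 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArticulationInstruction:
    organ: str       # e.g. "jaws", "lips", "tongue" (organs named in the text)
    action: str      # hypothetical action label, e.g. "down", "open"
    period_ms: int   # articulation-operation period (hypothetical value)


# Hypothetical lookup table; a real table would cover every phoneme.
INSTRUCTION_TABLE = {
    "A": [ArticulationInstruction("jaws", "down", 90),
          ArticulationInstruction("lips", "open", 90)],
    "I": [ArticulationInstruction("lips", "spread", 60)],
}


def instructions_for(phonemes):
    """Return (phoneme, instructions) pairs; unknown phonemes get no instruction."""
    return [(p, INSTRUCTION_TABLE.get(p, [])) for p in phonemes]


plan = instructions_for(["I", "A", "K"])
```

Here "A" drives a combination of two organs while "I" drives a single organ, mirroring the two cases described above.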
[0038] The voice-operation adjusting section 6 adjusts
the phoneme continuation period output from the
voice synthesizing section 4 and the articulation-operation
period output from the articulation-operation generating
section 5, which are input from the control section
3, according to a predetermined method (details thereof
will be described later), and outputs them to the control
section 3. When the phoneme continuation period shown in
Fig. 2 and the articulation-operation period shown in Fig.
3 are adjusted according to a method in which whichever
is the longer is substituted for the shorter for each
phoneme in the phoneme continuation period and the
articulation-operation period, for example, the phoneme
continuation period of each of the phonemes "X," "I,"
and "W" is extended so as to be equal to the corresponding
articulation-operation period.

[0039] The articulation-operation executing section 7
operates an organ 16 of articulation according to an
articulation-operation instruction output from the
articulation-operation generating section 5 and the adjusted
articulation-operation period output from the voice-operation
adjusting section 6, which are input from the
control section 3.

[0040] The external sensor 8 is provided, for example,
inside the mouth, which is included in the organ 16 of 
articulation, detects an object inserted into the mouth, 
and outputs detection information to the control section 
3. 

[0041] The voice output section 9 makes a speaker 
10 produce the voice corresponding to the synthesized 
voice data output from the voice synthesizing section 4, 
which is input from the control section 3. 
[0042] The organ 16 of articulation is a movable portion
provided for the head of the robot, which imitates
jaws, lips, a throat, a tongue, nostrils, and the like. 
[0043] The operation of the robot will be described 
next by referring to a flowchart shown in Fig. 5. In step 
S1, a voice signal input to the microphone of the input
section 1 is converted to text data and sent to the voice- 
language-information generating section 2. In step S2, 
the voice-language-information generating section 2 
outputs the voice language information corresponding 
to the text data input from the input section 1, to the control
section 3. The control section 3 sends the text data
(for example, "konnichiwa") serving as the voice lan- 
guage information input from the voice-language-infor- 
mation generating section 2, to the voice synthesizing 
section 4. 

[0044] In step S3, the voice synthesizing section 4 
generates phoneme information (in this case, 
"KOXNICHIWA") from the text data serving as the voice 
language information output from the voice-language- 
information generating section 2, which is sent from the 
control section 3; calculates the phoneme continuation 
period of each phoneme; and outputs them to the control
section 3. The control section 3 sends the phoneme information
output from the voice synthesizing section 4, to
the articulation-operation generating section 5. 
[0045] In step S4, the articulation-operation generating
section 5 calculates the articulation-operation instruction
and articulation-operation period corresponding
to each phoneme according to the phoneme information
output from the voice synthesizing section 4,
which is sent from the control section 3, and outputs 
them to the control section 3. The control section 3 
sends the articulation-operation period output from the 
articulation-operation generating section 5 and the pho- 
neme information and the phoneme continuation period 
output from the voice synthesizing section 4, to the 
voice-operation adjusting section 6. 
[0046] In step S5, the voice-operation adjusting section
6 adjusts the phoneme continuation period output
from the voice synthesizing section 4 and the
articulation-operation period output from the articulation-operation
generating section 5, which are sent from the control
section 3, according to a predetermined rule, and
outputs them to the control section 3.
[0047] First to fifth methods for adjusting the phoneme
continuation period and the articulation-operation period
will be described here by referring to Figs. 6A, 6B, 7,
8, 9A, 9B, 10, and 11. In the following description, it is
assumed that the phoneme continuation period generated
in step S3 is shown in Fig. 6A and the articulation-operation
period generated in step S4 is shown in Fig.
6B.

[0048] In the first method, the phoneme continuation
period and the articulation-operation period of each
phoneme are compared, and whichever is the longer is used
to substitute for the shorter. Fig. 7 shows an adjustment
result obtained by the first method. In examples shown
in Figs. 6A and 6B, since the phoneme continuation
period of each of the phonemes "K," "CH," and "W" is longer
than the corresponding articulation-operation period,
the articulation-operation period is replaced with the
phoneme continuation period as shown in (B) of Fig. 7.
Conversely, since the articulation-operation period of
each of the phonemes "O," "X," "N," "I," "I," and "A" is
longer than the corresponding phoneme continuation
period, the phoneme continuation period is replaced
with the articulation-operation period as shown in (A) of
Fig. 7.
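
The first method can be sketched in Python as a per-phoneme maximum. This is a minimal illustration only: the function name `adjust_longer_wins` and all millisecond values are hypothetical and are not read from Figs. 6A, 6B, or 7. The values were merely chosen so that the voice period is longer for "K," "CH," and "W" and the operation period is longer for the other phonemes.

```python
def adjust_longer_wins(voice_ms, motion_ms):
    """Per phoneme, replace the shorter period with the longer one."""
    if len(voice_ms) != len(motion_ms):
        raise ValueError("both lists need one entry per phoneme")
    adjusted = [max(v, m) for v, m in zip(voice_ms, motion_ms)]
    # After adjustment both tracks share the same per-phoneme period.
    return adjusted, list(adjusted)


phonemes = ["K", "O", "X", "N", "I", "CH", "I", "W", "A"]
voice_ms = [40, 30, 20, 30, 30, 50, 30, 50, 20]    # hypothetical values
motion_ms = [30, 60, 50, 60, 60, 40, 70, 40, 90]   # hypothetical values

adj_voice, adj_motion = adjust_longer_wins(voice_ms, motion_ms)
```

With these inputs, every phoneme ends up with the same period on both tracks, so utterance and articulation start and end together for each phoneme.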

[0049] In the second method, the start timing or the
end timing of any phoneme is synchronized. Fig. 8
shows an adjustment result obtained by the second
method. When synchronization is achieved at the start
timing of the phoneme "X," as shown in Fig. 8, data is lacking
before the start timing of the phoneme continuation
period of the phoneme "K" and after the end timing of
the phoneme continuation period of the phoneme "A."
Adjustment is achieved such that voices are not uttered
at the data-lacking portions and only articulation operations
are performed.
The user may specify the phoneme at which the start
timing is synchronized. Alternatively, the control section
3 may determine it according to a predetermined rule.
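
A minimal Python sketch of the second method follows, assuming synchronization at the start timing of one chosen phoneme. The function names and the durations are hypothetical. The sketch only computes start times on a shared time axis; the intervals in which one track has no data are simply the parts of the axis that track does not cover.

```python
def start_times(durations, anchor_index, anchor_time):
    """Start time of every segment when segment `anchor_index` starts at `anchor_time`."""
    t = anchor_time - sum(durations[:anchor_index])
    starts = []
    for d in durations:
        starts.append(t)
        t += d
    return starts


def align_at_anchor(voice_ms, motion_ms, anchor_index):
    """Shift both tracks so the anchor phoneme starts at the same instant."""
    # Place the anchor late enough that neither track starts before time 0.
    anchor_time = max(sum(voice_ms[:anchor_index]), sum(motion_ms[:anchor_index]))
    return (start_times(voice_ms, anchor_index, anchor_time),
            start_times(motion_ms, anchor_index, anchor_time))


# Hypothetical durations for "K", "O", "X", "N"; the anchor is "X" (index 2).
voice_starts, motion_starts = align_at_anchor([40, 30, 20, 30], [30, 60, 50, 60], 2)
```

In this example the voice track begins 20 ms after the operation track, so during the first 20 ms only the articulation operation is performed.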
[0050] In the third method, either the phoneme
continuation period or the articulation-operation period is
used for all phonemes. Fig. 9 shows an adjustment
result obtained by the third method in a case in which the
articulation-operation period has priority and the
articulation-operation period is substituted for the phoneme
continuation period for all phonemes. The user may
specify which of the phoneme continuation period and
the articulation-operation period has priority. Alternatively,
the control section 3 may select either of them
according to a predetermined rule.
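
The third method amounts to copying one track's periods onto the other. The Python sketch below is illustrative only; the function name, the `priority` parameter, and the durations are hypothetical.

```python
def adjust_with_priority(voice_ms, motion_ms, priority="motion"):
    """Use one track's periods for all phonemes on both tracks."""
    if priority not in ("voice", "motion"):
        raise ValueError("priority must be 'voice' or 'motion'")
    if len(voice_ms) != len(motion_ms):
        raise ValueError("both lists need one entry per phoneme")
    chosen = motion_ms if priority == "motion" else voice_ms
    # Return independent copies so later edits to one track do not alter the other.
    return list(chosen), list(chosen)


# Hypothetical durations; the articulation-operation period has priority here.
p_voice, p_motion = adjust_with_priority([40, 30, 20], [30, 60, 50], priority="motion")
```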
[0051] In the fourth method, the start timing or the end
timing of each phoneme is synchronized between the
phoneme continuation period and the articulation-operation
period, and blanks are placed at lacking periods
of time (indicating periods when neither utterance nor
an articulation operation is performed). Fig. 10 shows
an adjustment result obtained by the fourth method. A
blank is placed at a lacking period of time generated
before the start timing of the phoneme "K" in the
articulation-operation period as shown in (B) of Fig. 10, and
blanks are placed at lacking periods of time generated
before the start timing of the phonemes "O," "X," "N,"
and "I" in the phoneme continuation period, as shown
in (A) of Fig. 10.
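
One way to picture the fourth method is to give each phoneme a slot as long as the longer of its two periods and to pad the shorter track with a blank. The Python sketch below makes that assumption explicit; the slot model, the `align` parameter, the `BLANK` marker, and the durations are hypothetical and not taken from Fig. 10. With `align="end"` the blank precedes the shorter segment, which matches the description of blanks being placed before the start timing of a phoneme.

```python
BLANK = "_"  # hypothetical marker for a no-process period


def pad_with_blanks(phonemes, voice_ms, motion_ms, align="end"):
    """Build two tracks of (label, duration) segments with blanks filling the gaps."""
    if align not in ("start", "end"):
        raise ValueError("align must be 'start' or 'end'")
    voice_track, motion_track = [], []
    for p, v, m in zip(phonemes, voice_ms, motion_ms):
        slot = max(v, m)  # both tracks occupy the same slot for this phoneme
        for track, d in ((voice_track, v), (motion_track, m)):
            gap = slot - d
            if gap and align == "end":
                track.append((BLANK, gap))   # blank before the shorter segment
            track.append((p, d))
            if gap and align == "start":
                track.append((BLANK, gap))   # blank after the shorter segment
    return voice_track, motion_track


# Hypothetical durations: "K" is longer in voice, "O" is longer in motion.
voice_track, motion_track = pad_with_blanks(["K", "O"], [40, 30], [30, 60])
```

Both tracks then have the same total length, and each phoneme ends at the same instant on both.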

[0052] In the fifth method, the start timing or the end
timing of the phoneme located at the center of the
phoneme information is synchronized, the entire phoneme
continuation period and the entire articulation-operation
period are compared, and the shorter period is extended
so that it has the same length as the longer. More
specifically, for example, as shown in Fig. 11, the start timing
of the phoneme "I" located at the center of the phoneme
information "KOXNICHIWA" is synchronized and the
phoneme continuation period is extended to 550 ms
since the entire phoneme continuation period (300 ms)
is shorter in time than the articulation-operation period
(550 ms). Further specifically, the phoneme continuation
period of each of the phonemes "K," "O," "X," and
"N," which are located before the phoneme "I," is extended
by a factor of two (= 300/150), and the phoneme continuation
period of each of the phonemes "I," "CH," "I," "W," and
"A," which are located after the phoneme "I," is extended
by a factor of 1.25 (= 250/200).
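
The proportional extension can be sketched in Python as two scale factors, one for each side of the synchronized start timing. The function name and the individual per-phoneme durations below are hypothetical; they were chosen only so that the sums on each side reproduce the two ratios quoted above (150 ms extended to 300 ms before the center, 200 ms extended to 250 ms from the center onward). The sketch also assumes the voice track is the shorter one on both sides.

```python
def stretch_around_center(voice_ms, motion_ms, center):
    """Scale voice durations on each side of the center phoneme's start timing.

    Assumes the voice track is the shorter one on both sides of the center.
    """
    before = sum(motion_ms[:center]) / sum(voice_ms[:center])
    after = sum(motion_ms[center:]) / sum(voice_ms[center:])
    return [d * (before if i < center else after) for i, d in enumerate(voice_ms)]


# "K", "O", "X", "N" | "I", "CH", "I", "W", "A"; the center "I" is at index 4.
voice_ms = [30, 40, 40, 40, 40, 40, 40, 40, 40]    # hypothetical: 150 ms + 200 ms
motion_ms = [75, 75, 75, 75, 50, 50, 50, 50, 50]   # hypothetical: 300 ms + 250 ms

stretched = stretch_around_center(voice_ms, motion_ms, center=4)
```

The factors come out as 2.0 and 1.25, and the stretched voice track totals 550 ms, the same as the articulation-operation period.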

[0053] As described above, the phoneme continuation
period and the articulation-operation period are adjusted
by one of the first to fifth methods, or by a combination
of the first to fifth methods, and sent to the control
section 3.

[0054] Back to Fig. 5, in step S6, the control section
3 sends the adjusted phoneme continuation period output
from the voice-operation adjusting section 6, to the
voice synthesizing section 4, and sends the adjusted
articulation-operation period output from the voice-operation
adjusting section 6 and the articulation-operation
instruction output from the articulation-operation generating
section 5, to the articulation-operation executing
section 7. The voice synthesizing section 4 generates
synthesized voice data according to the adjusted phoneme
continuation period output from the voice-operation
adjusting section 6, which is input from the control
section 3, and outputs it to the control section 3. The
control section 3 also sends the synthesized voice data
output from the voice synthesizing section 4 to the voice
output section 9. The voice output section 9 makes the
speaker produce the voice corresponding to the synthesized
voice data output from the voice synthesizing section
4, which is input from the control section 3. In
synchronization with this operation, the articulation-operation
executing section 7 operates the organ 16 of articulation
according to the articulation-operation instruction
output from the articulation-operation generating
section 5 and the adjusted articulation-operation period
output from the voice-operation adjusting section 6,
which are input from the control section 3.
[0055] Since the robot is operated as described
above, the robot imitates the utterance operations of human
beings and animals more naturally.
[0056] When the external sensor 8 detects an object 



inserted into the mouth, which is included in the organ 
16 of articulation, during the process of step S6, detec- 
tion information is sent to the control section 3. The con- 
trol section 3 halts, resumes, or stops the processing of 
the articulation-operation executing section 7 and the 
voice output section 9 according to the detection infor- 
mation. With this operation, since a voice cannot be ut- 
tered when the object is inserted into the mouth, reality 
is enhanced. In addition to a case in which the detection 
information is sent from the external sensor 8, when the 
operation of the organ 16 of articulation is disturbed by 
some external force, the processing of the voice output 
section 9 may be halted, resumed, or stopped. 
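
The halt, resume, and stop behaviour can be pictured as a small state holder shared by the voice output and the articulation operation. The Python sketch below is only one possible policy, stated as an assumption: processing is halted while the sensor reports an object, resumed when the object is no longer reported, and never resumed once stopped. The class and method names are hypothetical.

```python
class PlaybackController:
    """Tracks whether voice output and articulation are running, halted, or stopped."""

    def __init__(self):
        self.state = "running"

    def on_sensor(self, object_detected):
        """Update the state from one reading of the external sensor."""
        # Once stopped, playback is not resumed by later sensor readings.
        if self.state != "stopped":
            self.state = "halted" if object_detected else "running"
        return self.state

    def stop(self):
        """Stop processing for good."""
        self.state = "stopped"
        return self.state


controller = PlaybackController()
states = [controller.on_sensor(True), controller.on_sensor(False), controller.stop()]
```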
[0057] In such a control, utterance processing is 
changed in response to a change of an articulation op- 
eration. Conversely, control may be executed such that 
an articulation operation is changed in response to a 
change of utterance processing, such as in a case in 
which an articulation operation is immediately changed 
when a word to be uttered is suddenly changed. 
[0058] In the present embodiment, the output of the 
voice-language-information generating section 2 is set 
to text data, such as "konnichiwa." It may be phoneme 
information, such as "KOXNICHIWA." 
[0059] The present invention can also be applied to a 
case in which the phonemes of an uttered word are syn- 
chronized with the operation of a portion other than the 
organs of articulation. In other words, the present inven- 
tion can be applied, for example, to a case in which the 
phonemes of an uttered word are synchronized with the 
operation of a neck or the operation of a hand, as shown 
in Fig. 12. 

[0060] In addition to robots, the present invention can 
further be applied to a case in which the phonemes of 
words uttered by a character expressed by computer 
graphics are synchronized with the operation of the 
character. 

[0061] The above-described series of processing can 
be executed by software as well as by hardware. When 
the series of processing is executed by software, the 
program constituting the software is installed from a recording
medium into a computer built into special hardware
or into a general-purpose personal computer
which executes various functions with various installed
programs.

[0062] This recording medium can be a package me- 
dium storing the program and distributed to the user to 
provide the program separately from the computer, such 
as a magnetic disk 12 (including a floppy disk), an optical
disk 13 (including a compact disk-read only memory 
(CD-ROM) and a digital versatile disk (DVD)), a magneto-optical
disk 14 (including a Mini disk (MD)), or a
semiconductor memory 15. In addition, the recording 
medium can be a ROM or a hard disk storing the pro- 
gram and distributed to the user in a condition in which 
it is placed in the computer in advance. 
[0063] In the present specification, steps describing 
the program which is stored in a recording medium include
processes executed in a time-sequential manner
according to the order of descriptions and also include 
processes executed not necessarily in a time-sequential 
manner but executed in parallel or independently. 



Claims 

1. A synchronization control apparatus for synchronizing
the output of a voice signal and the operation of
a movable portion, comprising:

phoneme-information generating means for
generating phoneme information formed of a
plurality of phonemes by using language information;

calculation means for calculating a phoneme
continuation period according to the phoneme
information generated by the phoneme-information
generating means;

computing means for computing the operation
period of the movable portion according to the
phoneme information generated by the
phoneme-information generating means;

adjusting means for adjusting the phoneme
continuation period calculated by the calculation
means and the operation period computed
by the computing means;

synthesized-voice-information generating
means for generating synthesized-voice information
according to the phoneme continuation
period adjusted by the adjusting means;

synthesizing means for synthesizing the voice
signal according to the synthesized-voice information
generated by the synthesized-voice-information
generating means; and

operation control means for controlling the operation
of the movable portion according to the
operation period adjusted by the adjusting
means.

2. A synchronization control apparatus according to
Claim 1, wherein the adjusting means compares the
phoneme continuation period and the operation period
corresponding to each of the phonemes and
performs adjustment by substituting whichever is
the longer for the shorter.
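
As an illustrative sketch only (not part of the claim language), the per-phoneme adjustment recited in Claim 2 might be expressed as follows. The function name `adjust_longer`, the use of millisecond values, and the example periods are assumptions introduced here for illustration.

```python
from typing import List, Tuple


def adjust_longer(phoneme_periods: List[float],
                  operation_periods: List[float]) -> Tuple[List[float], List[float]]:
    """For each phoneme, substitute the longer period for the shorter one.

    Both inputs hold one duration per phoneme (assumed here to be in ms).
    """
    if len(phoneme_periods) != len(operation_periods):
        raise ValueError("one period of each kind is required per phoneme")
    adjusted = [max(p, o) for p, o in zip(phoneme_periods, operation_periods)]
    # After adjustment the two sequences agree phoneme by phoneme.
    return list(adjusted), list(adjusted)


# Hypothetical periods for a three-phoneme word.
voice, motion = adjust_longer([80.0, 120.0, 60.0], [100.0, 90.0, 60.0])
```

In this sketch the first phoneme takes the operation period (100 ms), the second takes the phoneme continuation period (120 ms), and the third is left unchanged.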

3. A synchronization control apparatus according to
Claim 1, wherein the adjusting means performs adjustment
by synchronizing at least one of the start
timing and the end timing, of the phoneme continuation
period and the operation period corresponding
to any of the phonemes.

4. A synchronization control apparatus according to
Claim 1, wherein the adjusting means performs adjustment
by substituting one of the phoneme continuation
period and the operation period corresponding
to all of the phonemes, for the other.

5. A synchronization control apparatus according to
Claim 1, wherein the adjusting means performs adjustment
by synchronizing at least one of the start
timing and the end timing, of the phoneme continuation
period and the operation period corresponding
to each of the phonemes, and by placing
no-process periods at lacking intervals.
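
As an illustrative sketch only (not part of the claim language), one way to read Claim 5 is that the voice and the motion for each phoneme start together and the shorter of the two is followed by a no-process period. The sketch below aligns start timings only; the names `Slot` and `align_starts` and the example values are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    start: float        # common start time of voice and motion for this phoneme
    voice_idle: float   # no-process period placed after the voice segment
    motion_idle: float  # no-process period placed after the motion segment


def align_starts(phoneme_periods: List[float],
                 operation_periods: List[float]) -> List[Slot]:
    """Give both tracks the same start time per phoneme and pad the shorter one."""
    if len(phoneme_periods) != len(operation_periods):
        raise ValueError("one period of each kind is required per phoneme")
    slots: List[Slot] = []
    t = 0.0
    for p, o in zip(phoneme_periods, operation_periods):
        span = max(p, o)  # the slot lasts as long as the longer period
        slots.append(Slot(start=t, voice_idle=span - p, motion_idle=span - o))
        t += span
    return slots


# Hypothetical periods for a two-phoneme word.
slots = align_starts([80.0, 120.0], [100.0, 90.0])
```

Here the voice side idles for 20 units after the first phoneme, and the motion side idles for 30 units after the second, so neither period itself is stretched.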

6. A synchronization control apparatus according to
Claim 1, wherein the adjusting means compares the
phoneme continuation period and the operation period
corresponding to all of the phonemes and performs
adjustment by extending whichever is the
shorter in proportion.
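
As an illustrative sketch only (not part of the claim language), the proportional extension of Claim 6 might be read as comparing the total durations over all phonemes and scaling every period of the shorter track by the same ratio. The function name `extend_in_proportion` and the example values are assumptions introduced here.

```python
from typing import List, Tuple


def extend_in_proportion(phoneme_periods: List[float],
                         operation_periods: List[float]) -> Tuple[List[float], List[float]]:
    """Scale the track with the shorter total so that both totals match."""
    total_voice = sum(phoneme_periods)
    total_motion = sum(operation_periods)
    if total_voice <= 0 or total_motion <= 0:
        raise ValueError("total periods must be positive")
    if total_voice < total_motion:
        scale = total_motion / total_voice
        return [p * scale for p in phoneme_periods], list(operation_periods)
    scale = total_voice / total_motion
    return list(phoneme_periods), [o * scale for o in operation_periods]


# Hypothetical periods: the voice track (total 200) is shorter than the motion track (total 400).
voice, motion = extend_in_proportion([100.0, 100.0], [150.0, 250.0])
```

In this sketch each phoneme continuation period is doubled, while the operation periods are returned unchanged; the relative lengths within the extended track are preserved.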

7. A synchronization control apparatus according to
Claim 1, wherein the operation control means controls
the operation of the movable portion which imitates
the operation of an organ of articulation of an
animal.

8. A synchronization control apparatus according to
Claim 1, further comprising detection means for detecting
an external force operation applied to the
movable portion.

9. A synchronization control apparatus according to 
Claim 8, wherein at least one of the synthesizing 
means and the operation control means changes a 
process currently being executed, in response to a 
detection result obtained by the detection means. 

10. A synchronization control apparatus according to
Claim 1, wherein the synchronization control apparatus
is a robot.

11. A synchronization control method of synchronizing
the output of a voice signal and the operation of a
movable portion, comprising:

a phoneme-information generating step of generating
phoneme information formed of a plurality of
phonemes by using language information;

a calculation step of calculating a phoneme
continuation period according to the phoneme
information generated in the phoneme-information
generating step;

a computing step of computing the operation
period of the movable portion according to the
phoneme information generated in the phoneme-information
generating step;

an adjusting step for adjusting the phoneme
continuation period calculated in the calculation
step and the operation period computed in
the computing step;

a synthesized-voice-information generating
step of generating synthesized-voice information
according to the phoneme continuation period
adjusted in the adjusting step;

a synthesizing step of synthesizing the voice
signal according to the synthesized-voice information
generated in the synthesized-voice-information
generating step; and

an operation control step of controlling the operation
of the movable portion according to the
operation period adjusted in the adjusting step.

12. A recording medium storing a computer-readable
program for synchronizing the output of a voice signal
and the operation of a movable portion, the program
comprising:

a phoneme-information generating step of generating
phoneme information formed of a plurality of
phonemes by using language information;

a calculation step of calculating a phoneme
continuation period according to the phoneme
information generated in the phoneme-information
generating step;

a computing step of computing the operation
period of the movable portion according to the
phoneme information generated in the phoneme-information
generating step;

an adjusting step for adjusting the phoneme
continuation period calculated in the calculation
step and the operation period computed in
the computing step;

a synthesized-voice-information generating
step of generating synthesized-voice information
according to the phoneme continuation period
adjusted in the adjusting step;

a synthesizing step of synthesizing the voice
signal according to the synthesized-voice information
generated in the synthesized-voice-information
generating step; and

an operation control step of controlling the operation
of the movable portion according to the
operation period adjusted in the adjusting step.